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In this talk I discuss the general question of the portability of Molecular Dynamics codes for diffusive systems 
on parallel computers of the APE family. The intrinsic single precision arithmetics of the today available APE 
platforms does not seem to affect the numerical accuracy of the simulations, while the absence of integer addressing 
from CPU to individual nodes puts strong constraints on the possible programming strategies. Liquids can be 
very satisfactorily simulated using the "systolic" method. For more complex systems, like the biological ones at 
which we are ultimately interested in, the "domain decomposition" approach is best suited to beat the quadratic 
growth of the inter-molecular computational time with the number of elementary components of the system. The 
promising perspectives of using this strategy for extensive simulations of lipid bilayers are briefly reviewed. 



1. Introduction 

In simulating the behaviour of microscopic sys- 
tems Molecular Dynamics (MD) faces two major 
problems. One is the intrinsic limitation coming 
from the use of classical mechanics to describe 
the dynamics of the "elementary" components 
of the system 0. The second, not less impor- 
tant, is of practical nature and it has to do with 
the finiteness of computer resources available at 
any time. In recent investigations Q 1^ we ad- 
dressed the second of these questions, showing 
that parallel computers, particularly of the APE 
family can be successfully employed to speed 
up in a substantial way MD simulations of diffu- 
sive systems, i.e. of systems in which the "list" of 
atoms that have a non-negligible interaction with 
a given atom of the system changes with time. In 
a solid (or in Lattice Quantum Chromo-Dynamics 
- LQCD) the "list" is blocked and is assigned once 
for all at the beginning of the simulation. 

Static and dynamic properties of liquids can 
be adequately studied using the "systolic" 
method |^ . For the more complex case of lipid bi- 
layers we have developed new approaches, adapt- 
ing to these systems the general "domain decom- 
position" strategy. The latter has the virtue of 
leading to CPU times for the computation of the 
inter-molecular potential that (at constant den- 



sity) grow only linearly with the number of atoms. 



2. Liquid Butane 

As a first significant test case, we have stud- 
ied in great detail liquid butane {C4H1Q) ||], us- 
ing a standard Multiple Time Step (MTS) inte- 
gration algorithm |jl| (with a "long" integration 
time step Ai^ = 4 /s and a partition number 
Po ~ 8). The results of the simulation of the 
time evolution of a system of M = 512 molecules 
confirm that for a homogeneous diffusive system 
the "systolic" method is well suited for mas- 
sive parallel production. The method consists in 
computing the inter-molecular (Lennard-Jones) 
interactions between atoms belonging to different 
molecules, by first democraticaly distributing in 
bunches among the N nodes of the machine the 
molecules of the system. Coordinates and mo- 
menta of the N bunches of molecules are copied 
in transitory arrays and circulated through nodes. 
By bringing them successively in contact with the 
node-residing molecules, the crossed interaction 
terms are all computed in iV — 1 moves. The 
method does not beat the growth of the inter- 
molecular computational time, but decreases it by 
a factor equal to the number of nodes. 

On the 512-nodes APE configuration (Torre) it 
took in all 450 hours of CPU time to collect the 
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whole statistics of 10 ns presented in g]. Com- 
puter times of this size are certainly within the 
current standards of MD and LQCD simulations. 
For comparison, a system of 256 molecules has 
been simulated, in double precision, on a DIG- 
ITAL 200 4/233 a-station. To give an approxi- 
mate reference figure we may quote a factor of 50, 
as a gain in speed in going from the a-station to 
the Torre. 

Our code was written in TAO, the APE highest 
level language. However APE simulation times 
can be substantially reduced (by a factor from 
3 to 4), if the APE- assembler micro-code of the 
most time-consuming part of the program, where 
the inter-molecular forces are computed, is prop- 
erly optimized. 
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Figure 1. The time evolution of the total energy 
of a system of 512 molecules of liquid butane. 



Very good agreement is obtained among the 
various simulations. Our results also com- 
pare nicely with the available experimental 
data 1^ 1^ and with the pioneering simula- 
tion of ref . 07 where 64 molecules of butane were 
followed for up to 20 ps (after few ps of equilibra- 
tion). 

In our very long simulations we did not see 
signs of instabilities nor of any systematic drift 
in the value of the total energy of the system (see 
Fig. Single precision rounding problems were 
sometimes held responsible for pathologies of that 
kind. 



3. Towards simulating realistic membranes 

Simulating realistic cell membranes and study- 
ing their interaction with small peptides (mim- 
icking pharmaceuticals of possible therapeutical 
interest) is of the utmost importance, if not for 
immediate medical use, certainly for the devel- 
opment of new conceptual and practical tools in 
MD applications to biological systems. A lot of 
work has gone in this direction (see e.g. |l^ ||l^ 
and references therein), but we are still far from 
having a viable tool-kit for immediate practical 
use. 

Schematically a cell membrane is constituted 
by an almost spherical bilayer of phospho-lipidic 
molecules, separating the interior of the cell from 
the external world. Various kinds of peptides 
and proteins, responsible for the biochemical pro- 
cesses necessary for the life and the functionality 
of the cell, are plugged into the membrane. 

Phospho-lipids are large Y-shaped molecules 
(with more than 30 atoms, not counting carbon 
bound hydrogens) with a hydrophilic head and 
two hydrophobic tails. This hydropathicity con- 
figuration lead to a well defined bilayer 3-D struc- 
ture: the hydrophilic heads are in contact with 
water, present both outside and inside the cell, 
while the hydrophobic tails are more or less back- 
to-back pair-wise aligned. 

An important parameter governing the reac- 
tion rate of many biological processes taking place 
in the membrane is its "permeability" . In this re- 
spect a membrane can be regarded as a liquid 
crystal, with a permeability which depends "crit- 
ically" on the temperature, on the detailed chem- 
ical composition of the constituent phospho-lipids 
and on the concentration of chemicals dispersed 
in the membrane itself or in the solvent. 

Even from this very crude picture, it is clear 
that a detailed simulation of the dynamics of the 
membrane of a living cell is just impossible and 
we have to resort to a number of simplifications. 
As it appears experimentally that the nature and 
the location of the phase transitions, which con- 
trol the physico-chemical properties of the mem- 
brane, are related to the bulk ordering properties 
of the hydrophobic tails, a first step in the di- 
rection of simulating a realistic system is to take 
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a sufficiently large bilayer in a acqueous medium 
and begin to study 1) the behaviour of the rele- 
vant order parameters as functions of the temper- 
ature, 2) the dependence of the position of critical 
points upon the concentration of small intramem- 
brane peptides. 

We have started our investigation with a sys- 
tem of 2 X 256 Dimyristoyl-phosphatidylcholine 
(DMPC) molecules (each molecule is composed 
by 37 atoms) in vacuum, neglecting in these first 
trial simulations Coulomb interactions^. We have 
run the dynamics of the system at various tem- 
peratures on the Torre. In each run the history of 
the system was followed for several hundreds ps 
(plus equilibration). At very low temperatures 
(T < 200 K) the system appears to be stable, 
although we know that, lacking Coulomb inter- 
actions and in absence of solvent, it is actually 
unstable and expected to "explode" at higher T. 

Already in this oversimplified test case CPU 
simulation times are exceedingly large, as the 
computation of the inter-molecular forces require 
the evaluation of a daring (2 x 256 x 37)^/2 terms! 
Since the vast majority of them gives a negligi- 
bly small contribution to the forces (the Lennard- 
Jones potential decreases very fast with the dis- 
tance), the obvious way to cope with this problem 
is to avoid computing the very many effectively 
irrelevant terms. To this end the physical space in 
which the system lives is first decomposed into N 
domains, each one attributed to one of the nodes 
of the machine. The domains are in turn subdi- 
vided in cells. The number of cells in each domain 
and, hence, the spatial extension of each cell, is 
chosen so that the interaction between atoms re- 
siding in non-nearest-neighboring cells is negligi- 
bly small. Then inter-molecular interactions are 
computed only between the atoms of a cell and 
those of the 26/2 nearest neighboring ones. Ev- 
ery N remap integration steps, the coordinates of 
all the molecules are cross-checked node by node 
and, if necessary, molecules that have wandered 
away from the original cell are reassigned to the 
cell to which they came up to belong. The remap- 
ping of the system costs a time which only growth 

^ Special care must be exerced to deal with Coulomb inter- 
action. I will not discuss this problem here. The interested 
reader can have a look to WM and references therein. 



linerly with the number of particles. 

A problem with this approach on APE plat- 
forms is the lack of integer addressing to individ- 
ual nodes, which makes impossible to assign lo- 
cally to each node the set of indices representing 
the number of interaction terms to be computed 
between pairs of cells. This difficulty has been 
overcome by assigning to all nodes the same set 
of indices, namely the set of the largest values 
taken by them throughout the machine. Nodes 
with fewer than the maximal number of terms to 
be computed will wait until other nodes have fin- 
ished their job. 
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Figure 2. a-station, Cuhetto and Torre CPU 
times 



We have fully implemented these ideas on APE 
computers in the case of butane, obtaining the 
expected linear behaviour with the number, n, 
of interacting particles and a perfect scalability 
with the number of nodes in going from the Cu- 
hetto (N = 2^) to the Torre {N = 2^). CPU times 
for simulating the dynamics of a system of 2048 
molecules of butane for 1 ns with our standard 
MTS algorithm were measured on the Digital 200 
4/233 a-station, the Cubetto and the Torre, ob- 
taining 

T^^^ = 2.3 • 10-4 [2.3n + i^y] days 
TCubetto ^ 2.3 • 10-3 [3.4n + 'j^^n] days (1) 
jVTorre ^ 2.3 • IQ-^ [bAji + ly;^"] days 

Eqs. Jl with Nremap = N^pdate = 40 are plotted 
in Fig. g as functions of n. Notice that in the case 
of the a-station we have used the "list" method 
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to speed up the simulation. As the list updat- 
ing time growths quadratically with n, for a suf- 
ficiently large number of atoms the "list" method 
curve will always cross the "domain decomposi- 
tion" straight line. 
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